Austrian Hotels Dataset
Overview
This dataset contains realistic simulated data on hotels across Austria, designed for practicing data wrangling and table joins. The dataset consists of multiple related tables that can be combined using various join operations.
Used in: Week 4 (Joining Tables)
Generated by: Claude 3.7 Sonnet, with realistic relationships between variables
Dataset Structure
The dataset comprises 8 related tables describing 200 hotels in 10 Austrian cities, covering occupancy, pricing, tourism statistics, economic indicators, guest reviews, and amenities.
Core Tables
File | Description | Rows | Key Columns |
---|---|---|---|
hotels.csv | Basic hotel information | 200 | hotel_id (PK) |
cities.csv | City information | 10 | city (PK) |
monthly_occupancy.csv | Monthly hotel performance metrics | ~3,800 | hotel_id, month, year |
city_tourism.csv | Monthly tourism statistics by city | 240 | city, month, year |
economic_indicators.csv | Monthly economic indicators | 24 | month, year |
reviews.csv | Hotel guest reviews | ~1,700 | review_id (PK), hotel_id (FK) |
amenities.csv | List of possible hotel amenities | 10 | amenity_id (PK) |
hotel_amenities.csv | Hotel-amenity relationships | ~1,000 | hotel_id, amenity_id |
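Before joining, it can help to confirm that the key columns listed above behave as keys. The following is a minimal sketch in Python with pandas, not part of the dataset itself. The helper names `is_primary_key` and `orphan_keys` are hypothetical, and the small data frames are made-up stand-ins for hotels.csv and reviews.csv that reuse only the documented key column names.

```python
import pandas as pd

# Toy stand-ins for hotels.csv and reviews.csv. Only the key column names
# (hotel_id, city, review_id) come from this README; all values are made up.
hotels = pd.DataFrame({
    "hotel_id": [1, 2, 3],
    "city": ["Vienna", "Graz", "Vienna"],
})
reviews = pd.DataFrame({
    "review_id": [10, 11, 12, 13],
    "hotel_id": [1, 1, 3, 3],
})


def is_primary_key(df, cols):
    """Return True if the column(s) contain no missing values and no duplicates."""
    cols = [cols] if isinstance(cols, str) else list(cols)
    no_missing = df[cols].notna().all().all()
    no_duplicates = not df.duplicated(subset=cols).any()
    return bool(no_missing and no_duplicates)


def orphan_keys(child, parent, col):
    """Return values of `col` that appear in `child` but not in `parent`."""
    return sorted(set(child[col].dropna()) - set(parent[col].dropna()))


hotels_pk_ok = is_primary_key(hotels, "hotel_id")      # hotel_id should be unique
reviews_pk_ok = is_primary_key(reviews, "review_id")   # review_id should be unique
orphans = orphan_keys(reviews, hotels, "hotel_id")     # reviews pointing at no hotel

print(hotels_pk_ok, reviews_pk_ok, orphans)
```

With the real files, the toy frames would be replaced by `pd.read_csv("hotels.csv")` and so on, assuming the files sit in the working directory. Passing a list such as `["hotel_id", "month", "year"]` checks a composite key the same way.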
Key Relationships
- Many-to-One: Hotels → Cities (through city name; each hotel is in one city, and a city has many hotels)
- One-to-Many: Hotels → Monthly Occupancy, Hotels → Reviews
- Many-to-Many: Hotels ↔︎ Amenities (through hotel_amenities)
- Composite Keys: Monthly data requires (hotel_id, month, year) or (city, month, year)
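The relationships listed above could be exercised roughly as follows. This is a sketch in Python with pandas on made-up rows, not the course's reference solution. Only the key column names come from this README; `population`, `occupancy_rate`, `tourist_arrivals`, and `amenity_name` are assumed column names used for illustration.

```python
import pandas as pd

# Toy stand-ins for the CSV files; non-key columns and all values are assumptions.
hotels = pd.DataFrame({
    "hotel_id": [1, 2, 3],
    "city": ["Vienna", "Graz", "Vienna"],
})
cities = pd.DataFrame({
    "city": ["Vienna", "Graz"],
    "population": [1_900_000, 290_000],      # hypothetical column
})
monthly_occupancy = pd.DataFrame({
    "hotel_id": [1, 1, 2],
    "month": [1, 2, 1],
    "year": [2023, 2023, 2023],
    "occupancy_rate": [0.61, 0.58, 0.70],    # hypothetical column
})
city_tourism = pd.DataFrame({
    "city": ["Vienna", "Vienna", "Graz"],
    "month": [1, 2, 1],
    "year": [2023, 2023, 2023],
    "tourist_arrivals": [500, 450, 80],      # hypothetical column
})
amenities = pd.DataFrame({
    "amenity_id": [1, 2],
    "amenity_name": ["Pool", "Spa"],         # hypothetical column
})
hotel_amenities = pd.DataFrame({
    "hotel_id": [1, 1, 3],
    "amenity_id": [1, 2, 2],
})

# Hotels to cities on the city name. Several hotels can share a city, so
# validate="many_to_one" raises an error if `city` is duplicated in `cities`.
hotels_cities = hotels.merge(cities, on="city", how="left", validate="many_to_one")

# Many-to-many: go through the hotel_amenities bridge table in two steps.
hotels_with_amenities = (
    hotels.merge(hotel_amenities, on="hotel_id", how="inner")
          .merge(amenities, on="amenity_id", how="inner")
)

# Composite keys: attach each hotel's city to its monthly rows, then match
# city-level tourism figures on (city, month, year).
occupancy_with_tourism = (
    monthly_occupancy
    .merge(hotels[["hotel_id", "city"]], on="hotel_id", how="left",
           validate="many_to_one")
    .merge(city_tourism, on=["city", "month", "year"], how="left",
           validate="many_to_one")
)

print(hotels_cities)
print(hotels_with_amenities)
print(occupancy_with_tourism)
```

Joining monthly tables on `month` alone would match January 2023 with January of every other year, which is why the full composite key is passed to `on`.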
Documentation
- hotel-data-readme.md: Detailed schema documentation with column descriptions and data types
Learning Objectives
This dataset allows students to practice:
- Inner, left, right, and full joins
- One-to-one and one-to-many relationships
- Composite key joins
- Data aggregation after joins
- Handling missing values in joins
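Two of these objectives, handling missing values in joins and aggregating after a join, can be illustrated together. The sketch below uses Python with pandas on made-up rows. The `rating` column is an assumed name, and the values are invented for illustration.

```python
import pandas as pd

# Toy stand-ins for hotels.csv and reviews.csv; `rating` is a hypothetical column.
hotels = pd.DataFrame({
    "hotel_id": [1, 2, 3],
    "city": ["Vienna", "Graz", "Vienna"],
})
reviews = pd.DataFrame({
    "review_id": [10, 11, 12],
    "hotel_id": [1, 1, 3],
    "rating": [4.0, 5.0, 3.0],
})

# A left join keeps every hotel, including those without reviews.
# indicator=True adds a `_merge` column that records where each row came from.
joined = hotels.merge(reviews, on="hotel_id", how="left", indicator=True)

# Hotels with no matching review have `_merge == "left_only"` and NaN ratings.
unreviewed = joined.loc[joined["_merge"] == "left_only", "hotel_id"].tolist()

# Aggregate after the join: mean rating and number of reviews per city.
# "count" ignores missing values, so hotels without reviews add zero.
city_ratings = (
    joined.groupby("city")
          .agg(mean_rating=("rating", "mean"), n_reviews=("review_id", "count"))
          .reset_index()
)

print(unreviewed)
print(city_ratings)
```

An inner join would silently drop the unreviewed hotel, so a city whose hotels have no reviews would vanish from the summary instead of showing a missing mean.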
Sample Research Questions
- How do hotel prices vary by city and season?
- What’s the relationship between amenities and guest ratings?
- How do economic indicators affect hotel occupancy rates?
- Which cities have the highest tourism-to-hotel capacity ratios?